36 research outputs found

    Comparing yeast genome assemblies

    The recent transition to Next-Generation Sequencing technology has accelerated the growth of genome projects exponentially. This explosion includes a multitude of species, with different strains and individuals being sequenced and made available to the scientific community. As time passes, errors in genome assemblies are also being discovered and corrected. Biologists need to update their working assembly to a newer version, or to convert between different strains or species for comparisons. The LiftOver utility in the UCSC Genome Browser handles these tasks with ease. Unfortunately, the choice of yeast genome conversions is limited. Here, I extend the capabilities of LiftOver by developing applications that efficiently generate the chain files LiftOver requires. These files are then used by a website that I built to allow conversion between assemblies, strains, or species of yeast using LiftOver. I also used R to produce dot-matrix plots of sequence alignments for rapid comparative analysis of a new genome sequence. One important aspect of genome biology is the characterisation of replication start sites, called DNA replication origins. Studies with confirmed and predicted replication origin locations, specifically in the budding yeast Saccharomyces cerevisiae, are collated in a database (OriDB). However, the structure of OriDB is complex to maintain, and the database currently describes only S. cerevisiae replication origins. Here, I revamp the OriDB website and database to be future-proof, so that additional studies or species can be added without difficulty and maintenance can be carried out with ease. The database will also include data on Schizosaccharomyces pombe replication origins.

    Development of ListeriaBase and comparative analysis of Listeria monocytogenes

    Background: Listeria consists of both pathogenic and non-pathogenic species. Reports of similarities in genomic content between some pathogenic and non-pathogenic species necessitate the investigation of these species at the genomic level to understand the evolution of virulence-associated genes. With Listeria genome data growing exponentially, comparative genomic analysis may give better insights into the evolution, genetics and phylogeny of Listeria spp., leading to better management of the diseases they cause. Description: With this motivation, we have developed ListeriaBase, a web-based Listeria genomic resource and analysis platform to facilitate comparative analysis of Listeria spp. ListeriaBase currently houses 850,402 protein-coding genes, 18,113 RNAs and 15,576 tRNAs from 285 genome sequences of different Listeria strains. An AJAX-based real-time search system implemented in ListeriaBase facilitates searching of this large body of genomic data. Our in-house comparative analysis tools were customized and incorporated in ListeriaBase to facilitate comparative genomic analysis of Listeria spp.: the Pairwise Genome Comparison (PGC) tool for comparing two genomes, the Pathogenomics Profiling Tool (PathoProT) for comparing virulence genes, and ListeriaTree for phylogenetic classification. Interestingly, we identified a unique genomic feature in the L. monocytogenes genomes in our analysis. The Auto protein sequences of the serotype 4 and the non-serotype 4 strains of L. monocytogenes possess unique sequence signatures that can differentiate the two groups. We propose that the aut gene may be a potential gene marker for differentiating the serotype 4 strains from other serotypes of L. monocytogenes. Conclusions: ListeriaBase is a useful resource and analysis platform that can facilitate comparative analysis of Listeria for the scientific community. We have successfully demonstrated some key utilities of ListeriaBase. The knowledge that we obtained from the analyses of L. monocytogenes may be important for future functional studies of this human pathogen. ListeriaBase is currently available at http://listeria.um.edu.my

    Concern-oriented use case maps

    Concern-Oriented Reuse (CORE) is a reuse paradigm that extends model-driven engineering with advanced modularization, goal modeling, and software product lines. Previous work enables modeling with CORE at the design level using Reusable Aspect Models (RAM). Requirements elicitation is also a crucial aspect of the software development process, and one visual notation that expresses use cases as graphical workflows is Use Case Maps (UCM). UCM bridges the gap between requirements and design, and is part of the User Requirements Notation (URN) for scenario modeling. This thesis addresses the need for enabling scenario modeling in CORE. Based on Aspect-Oriented Use Case Maps (AoUCM), we introduce a novel technique that applies advanced separation of concerns to model-driven requirements elicitation with use cases: Concern-Oriented Use Case Maps (CoUCM). We define a metamodel for CoUCM that derives from the CORE metamodel, and formulate the weaving algorithm for CoUCM model composition. We then implement CoUCM in the TouchCORE tool as a proof of concept. We present a working application of scenario modeling with TouchCORE, in which we further validate our implementation through case studies and workflow patterns. Validation shows that CoUCM is able to model certain requirements concerns in a reusable way, and that they can then easily be applied to different reuse contexts.
    Concern-Oriented Reuse (CORE) is a new reuse approach that builds on model-driven engineering (MDE), advanced modularization, requirements modeling, and software product lines. Until now, CORE only supported design modeling with the Reusable Aspect Models (RAM) modeling language. However, requirements modeling is a fundamental part of the MDE approach. Use Case Maps (UCM) is a modeling language that expresses use cases visually as workflows. UCM makes it possible to formalize textual, and therefore informal, use cases, and thus to move more easily from requirements models to design models. This thesis integrates UCM with the CORE reuse approach. Drawing on the Aspect-Oriented Use Case Maps (AoUCM) technique, we propose a concern-oriented extension of UCM named Concern-Oriented Use Case Maps (CoUCM). We define a metamodel for CoUCM that integrates with the CORE metamodel, and we propose weaving algorithms that enable the composition of UCM models from different concerns or features. A prototype implementation of CoUCM in the TouchCORE modeling tool is also described. To validate the expressiveness of our solution, we show how CoUCM applies to several use cases, which demonstrate that CoUCM can modularize certain concerns and make them reusable, for example the different payment interactions.

    Comparative Genomic Analysis Reveals a Possible Novel Non-Tuberculous Mycobacterium Species with High Pathogenic Potential

    Mycobacteria have been reported to cause a wide range of human diseases. We present the first whole-genome study of a Non-Tuberculous Mycobacterium, Mycobacterium sp. UM_CSW (referred to hereafter as UM_CSW), isolated from a patient diagnosed with bronchiectasis. Our data suggest that this clinical isolate is likely a novel mycobacterial species, supported by clear evidence from molecular phylogenetic, comparative genomic, ANI and AAI analyses. UM_CSW is closely related to the Mycobacterium avium complex. While it has characteristic features of an environmental bacterium, it also shows a high pathogenic potential with the presence of a wide variety of putative genes related to bacterial virulence and shares very similar pathogenomic profiles with the known pathogenic mycobacterial species. Thus, we conclude that this possible novel Mycobacterium species should be tightly monitored for its possible causative role in human infections.

    Phylogenetic relationship of UM_CSW with other mycobacterial species.

    The phylogenetic tree was generated using core genome SNPs and the maximum likelihood method. Bootstrap numbers were generated in 1,000 runs. Nodes with bootstrap support values are indicated.

    ANI analysis.

    The top six relatives of Mycobacterium sp. UM_CSW are all members of the M. avium complex.